
101 research outputs found

Parameter Estimation of Complex Systems from Sparse and Noisy Data

Author: Chu Yunfei

Mathematical modeling is a key component of many disciplines in science and engineering. A mathematical model that represents the important behavior of a real system can substitute for the real process in many analysis and synthesis tasks. The performance of model-based techniques, e.g. system analysis, computer simulation, controller design, sensor development, state filtering, product monitoring, and process optimization, depends heavily on the quality of the model used. It is therefore very important to be able to develop an accurate model from the available experimental data. Parameter estimation is usually formulated as an optimization problem in which the parameter estimate is computed by minimizing the discrepancy between the model prediction and the experimental data. If the model is simple and a large amount of data is available, the estimation problem is frequently well-posed and a small data-fitting error automatically results in an accurate model. However, this is not always the case. If the model is complex and only sparse and noisy data are available, the estimation problem is often ill-conditioned and a good data fit does not ensure accurate model predictions. Many challenges that can often be neglected when estimating simple models need to be carefully considered when estimating complex models.
To obtain a reliable and accurate estimate from sparse and noisy data, a set of techniques is developed to address the challenges encountered in the estimation of complex models, including (1) model analysis and simplification, which identifies the important sources of uncertainty and reduces the model complexity; (2) experimental design, which collects information-rich data by setting optimal experimental conditions; (3) regularization of the estimation problem, which solves the ill-conditioned large-scale optimization problem by reducing the number of parameters; (4) nonlinear estimation and filtering, which fits the data using various estimation and filtering algorithms; and (5) model verification, which applies statistical hypothesis tests to the prediction error. The developed methods are applied to different types of models, ranging from models found in the process industries to biochemical networks, some of which are described by ordinary differential equations with dozens of state variables and more than a hundred parameters.
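
The optimization view of parameter estimation in this abstract can be illustrated with a small sketch. It is not the author's method: the two-exponential model, the noise level, the four-sample time grid, and the names `design_matrix` and `fit_least_squares` are all assumptions chosen for illustration, and ordinary least squares stands in for the general discrepancy-minimization problem.

```python
import numpy as np

def design_matrix(t):
    # Hypothetical two-parameter model (an assumption, not from the abstract):
    #   y(t) = theta0 * exp(-t) + theta1 * exp(-1.1 * t)
    # The two basis functions are nearly collinear on a short time grid.
    return np.column_stack([np.exp(-t), np.exp(-1.1 * t)])

def fit_least_squares(A, y):
    # Parameter estimate minimizing the squared discrepancy ||A @ theta - y||^2.
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta

rng = np.random.default_rng(0)        # fixed seed for repeatability
theta_true = np.array([1.0, 2.0])     # assumed "true" parameters
t = np.linspace(0.0, 1.0, 4)          # sparse data: only four samples
A = design_matrix(t)
noise = 0.01 * rng.standard_normal(t.size)
y = A @ theta_true + noise            # noisy measurements

theta_hat = fit_least_squares(A, y)
fit_residual = np.linalg.norm(A @ theta_hat - y)      # quality of the data fit
param_error = np.linalg.norm(theta_hat - theta_true)  # quality of the estimate
condition_number = np.linalg.cond(A)                  # measure of ill-conditioning

print("condition number: ", condition_number)
print("data-fit residual:", fit_residual)
print("parameter error:  ", param_error)
```

Comparing the printed residual with the parameter error shows the two quantities the abstract distinguishes: the least-squares fit is never worse than the noise level, yet with nearly collinear basis functions the recovered parameters can still be far from the true ones.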

Texas A&M Repository

Synaesthesia in Chinese: A corpus-based study on gustatory adjectives in Mandarin

Authors: Huang Chu-Ren; Long Yunfei; Zhao Qingqing
Publication venue: Walter de Gruyter GmbH
Publication date: 28/08/2018

This study adopted a corpus-based approach to examine the synaesthetic metaphors of gustatory adjectives in Mandarin. Based on the distribution of synaesthetic uses in the corpus, we found that: (1) the synaesthetic metaphors of Mandarin gustatory adjectives exhibited directionality; (2) the directionality of Mandarin synaesthetic gustatory adjectives showed both commonality and specificity when compared with the attested directionality of gustatory adjectives in English, which calls for a closer re-examination of the claim of cross-lingual universality of synaesthetic tendencies; and (3) the distribution and directionality of Mandarin synaesthetic gustatory adjectives could not be predicted by a single hypothesis, such as the embodiment-driven approach or the biological association-driven approach. Thus, linguistic synaesthesia was constrained by both the embodiment principle and the biological association mechanism.

University of Essex Research Repository

Crossref

Iterative algorithm for lane reservation problem on transportation network

Authors: Che Ada; Chu Feng; Fang Yunfei; Mammar Said
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 11/04/2011

In this paper, we study an NP-hard lane reservation problem on a transportation network. By selecting lanes to be reserved on the existing transportation network under some special situations, the transportation tasks can be accomplished on the reserved lanes while satisfying time or safety conditions. The lane reservation strategy is a flexible and economical method for traffic management. However, reserving lanes has an impact on normal traffic because the reserved lanes can be used only by the special tasks. The reserved lanes should therefore be chosen carefully to minimize the total traffic impact when applying the lane reservation strategy to the transportation tasks. In this paper, an integer linear programming model is formulated for the considered problem and an optimal algorithm based on the cut-and-solve method is proposed. Some new techniques are developed for the cut-and-solve method to accelerate the convergence of the proposed algorithm. Numerical results on 125 randomly generated instances show that, in terms of average computing time, the proposed algorithm finds optimal solutions much faster than the MIP solver of the commercial software CPLEX 12.1.

HAL Evry

Improving attention model based on cognition grounded data for sentiment analysis

Authors: Huang Chu-Ren; Li Minglei; Long Yunfei; Lu Qin; Xiang Rong
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

Attention models have been proposed for sentiment analysis and other classification tasks because some words are more important than others. However, most existing methods use local context-based information, affective lexicons, or user preference information. In this work, we propose a novel attention model trained on cognition grounded eye-tracking data. First, a reading prediction model is built using eye-tracking data as dependent data and other features in the context as independent data. The predicted reading time is then used to build a cognition grounded attention layer for neural sentiment analysis. Our model can capture attention in context both in terms of words at the sentence level and sentences at the document level. Other attention mechanisms can also be incorporated to capture other aspects of attention, such as local attention and affective lexicons. The results of our work include two parts. The first part compares our proposed cognition grounded attention model with other state-of-the-art sentiment analysis models. The second part compares our model with an attention model based on other lexicon-based sentiment resources. Evaluations show that sentiment analysis using the cognition grounded attention model significantly outperforms the state-of-the-art sentiment analysis methods. Comparisons with affective lexicons also indicate that using cognition grounded eye-tracking data has advantages over other sentiment resources because it considers both word information and context information. This work brings insight into how cognition grounded data can be integrated into natural language processing (NLP) tasks.

University of Essex Research Repository

Crossref

Mandarin Chinese modality exclusivity norms

Authors: Chen I-Hsuan; Huang Chu-Ren; Long Yunfei; Lu Qin; Zhao Qingqing
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2019

Modality exclusivity norms have been developed in different languages for research on the relationship between perceptual and conceptual systems. This paper sets up the first modality exclusivity norms for Chinese, a Sino-Tibetan language with semantics as its orthographically relevant level. The norms are collected through two studies based on Chinese sensory words. The experimental designs take into consideration the morpho-lexical and orthographic structures of Chinese. Study 1 provides a set of norms for Mandarin Chinese single-morpheme words in mean ratings of the extent to which a word is experienced through the five sense modalities. The degrees of modality exclusivity are also provided. The collected norms are further analyzed to examine how sub-lexical orthographic representations of sense modalities in Chinese characters affect speakers’ interpretation of the sensory words. In particular, we found higher modality exclusivity ratings for the sense modality explicitly represented by a semantic radical component, as well as higher auditory dominant modality ratings for characters with transparent phonetic symbol components. Study 2 presents the mean ratings and modality exclusivity of coordinate disyllabic compounds involving multiple sense modalities. These studies open new perspectives in the study of modality exclusivity. First, links between modality exclusivity and writing systems have been established, which strengthens previous accounts of the influence of orthography on the processing of visual information in reading. Second, a new set of modality exclusivity norms for compounds is proposed to show how different linguistic factors compete to influence modality exclusivity, and potentially to allow such norms to be linked to studies on synesthesia and semantic transparency.

University of Essex Research Repository

PolyU Institutional Repository

Directory of Open Access Journals

FigShare

Learning Heterogeneous Network Embedding From Text and Links

Authors: Bi Chenglin; Huang Chu-Ren; Li Minglei; Long Yunfei; Lu Qin; Xiang Rong; Xiong Dan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2018

Finding methods to represent multiple types of nodes in heterogeneous networks is both challenging and rewarding, as there is much less work in this area than on homogeneous networks. In this paper, we propose a novel approach to learning node embeddings for heterogeneous networks through a joint learning framework of both network links and text associated with nodes. A novel attention mechanism is also used to make good use of text extended through links to obtain a much larger network context. Link embedding is first learned through a random-walk-based method to process multiple types of links. Text embedding is separately learned at both the sentence level and the document level to capture salient semantic information more comprehensively. Then, both types of embeddings are jointly fed into a hierarchical neural network model to learn node representations through mutual enhancement. The attention mechanism follows linked edges to obtain the context of adjacent nodes and so extend the context for node representation. The evaluation on a link prediction task in a heterogeneous network data set shows that our method outperforms the current state-of-the-art method by 2.5%-5.0% in AUC values with a p-value less than 10^-9, indicating a very significant improvement.

University of Essex Research Repository

Crossref

PolyU Institutional Repository

Biostratigraphic correlation and mass extinction during the Permian-Triassic transition in terrestrial-marine siliciclastic settings of South China

Authors: Benton Michael J.; Chu Daoliang; Huang Yunfei; Song Haijun; Song Ting; Tian Li; Tong Jinnan; Yu Jianxin
Publication venue: Elsevier BV
Publication date: 01/09/2016

Explore Bristol Research

Affection Driven Neural Networks for Sentiment Analysis

Authors: Gu Jinghang; Huang Chu-Ren; Long Yunfei; Lu Qin; Wan Mingyu; Xiang Rong
Publication venue: European Language Resources Association
Publication date: 01/05/2020

Deep neural network models have played a critical role in sentiment analysis, with promising results over the past decade. One of the essential challenges, however, is how external sentiment knowledge can be effectively utilized. In this work, we propose a novel affection-driven approach to incorporating affective knowledge into neural network models. The affective knowledge is obtained in the form of a lexicon under the Affect Control Theory (ACT), which is represented by vectors of three-dimensional attributes in Evaluation, Potency, and Activity (EPA). The EPA vectors are mapped to an affective influence value and then integrated into Long Short-term Memory (LSTM) models to highlight affective terms. Experimental results show a consistent improvement of our approach over conventional LSTM models by 1.0% to 1.5% in accuracy on three large benchmark datasets. Evaluations across a variety of algorithms have also proven the effectiveness of leveraging affective terms for deep model enhancement.

University of Essex Research Repository

Lexical data augmentation for sentiment analysis

Authors: Chersoni Emmanuele; Huang Chu-Ren; Li Wenjie; Long Yunfei; Lu Qin; Xiang Rong
Publication venue: Wiley
Publication date: 17/06/2021

Machine learning methods, especially deep learning models, have achieved impressive performance in various natural language processing tasks, including sentiment analysis. However, deep learning models are more demanding of training data. Data augmentation techniques, which generate new instances by modifying existing data or by relying on external knowledge bases, are widely used to address the scarcity of annotated data that hinders the full potential of machine learning techniques. This paper presents our work using part-of-speech (POS) focused lexical substitution for data augmentation (PLSDA) to enhance the performance of machine learning algorithms in sentiment analysis. We exploit POS information to identify words to be replaced and investigate different augmentation strategies to find semantically related substitutions when generating new instances. The choice of POS tags as well as a variety of strategies such as semantic-based substitution methods and sampling methods are discussed in detail. Performance evaluation focuses on the comparison between PLSDA and two previous lexical substitution-based data augmentation methods, one of which is thesaurus-based and the other lexicon manipulation-based. Our approach is tested on five English sentiment analysis benchmarks: SST-2, MR, IMDB, Twitter, and AirRecord. Hyperparameters such as the candidate similarity threshold and the number of newly generated instances are optimized. Results show that six classifiers (SVM, LSTM, BiLSTM-AT, bidirectional encoder representations from transformers [BERT], XLNet, and RoBERTa) trained with PLSDA achieve an accuracy improvement of more than 0.6% compared with the two previous lexical substitution methods, averaged over the five benchmarks. Introducing a POS constraint and well-designed augmentation strategies can improve the reliability of lexical data augmentation methods. Consequently, PLSDA significantly improves the performance of sentiment analysis algorithms.

University of Essex Research Repository

Crossref

PolyU Institutional Repository